Cleansing and preparation of data for statistical analysis: A step necessary in oral health sciences research

Authors

  • Ali Akbar Haghdoost, Professor, Research Center for Modeling in Health, Institute of Futures Studies in Health, Kerman University of Medical Sciences, Kerman, Iran
  • Arash Shahravan, Professor, Endodontology Research Center and Oral and Dental Diseases Research Center and Kerman Social Determinants on Oral Health Research Center, Kerman University of Medical Sciences, Kerman, Iran
  • Hossein Molavi Vardajani, Assistant Professor, Department of MPH, School of Medicine, Shiraz University of Medical Sciences, Shiraz, Iran
  • Maryam Rad, Assistant Professor, Oral and Dental Diseases Research Center, Kerman University of Medical Sciences, Kerman, Iran
Abstract:

Many published articles still make no mention of data quality control, which may indicate that researchers attach too little importance to carrying out or reporting it. Yet quality control of data is one of the most important steps in a research project, and neglecting it can distort study results. Drawing researchers' attention to data quality control is therefore a necessary step toward improving the quality of research studies and reports. In this article, we define the processes of cleansing and preparing data and locate them within research protocols. We present an algorithm for cleansing and preparing data. We then introduce the most important potential errors in data through examples and demonstrate their effects on study results. We describe the main causes of different kinds of errors, the techniques used to identify them, and the techniques used to prevent or rectify them. We next turn to data preparation, introducing techniques for checking and managing the assumptions of statistical models before analysis. Because statistical models that assume normality are so widely used, we focus on that assumption, presenting techniques for identifying non-normally distributed data and methods for handling it. Cleansing and preparing data can substantially improve the quality and accuracy of research results. Researchers should understand why different kinds of data errors occur, how to identify them, and how to prevent or rectify them; they should learn the appropriate techniques and report them in their studies.
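As a rough illustration of the two kinds of checks the abstract names (identifying erroneous values, then checking the normality assumption), the following Python sketch may help. It is not the authors' algorithm, which is not reproduced in the abstract. The data, the plausible range of 0 to 120 years, and the helper names `flag_out_of_range` and `sample_skewness` are all assumptions made for this example, and skewness with a log transform is only one of several possible ways to detect and handle non-normality.

```python
import math
import statistics


def flag_out_of_range(values, low, high):
    """Return indices of values that are missing (None) or outside [low, high]."""
    return [i for i, v in enumerate(values)
            if v is None or not (low <= v <= high)]


def sample_skewness(values):
    """Moment-based sample skewness; close to 0 for symmetric data."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical ages in years: 230 looks like a typing error, None is missing.
ages = [23, 31, 27, 230, 45, None, 38]
suspect = flag_out_of_range(ages, 0, 120)  # assumed plausible range

# Hypothetical right-skewed, strictly positive measurement.
raw = [1, 2, 2, 3, 3, 4, 5, 8, 15, 40]
logged = [math.log(v) for v in raw]  # log transform needs values > 0

skew_raw = sample_skewness(raw)
skew_log = sample_skewness(logged)

print("suspect indices:", suspect)
print("skewness before log transform:", round(skew_raw, 2))
print("skewness after log transform:", round(skew_log, 2))
```

Flagged records would normally be checked against the source documents rather than deleted outright, and a formal normality test or a plot could complement the skewness figure.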


similar resources

a contrastive analysis of concord and head parameter in english and azerbaijani

This thesis examines and compares two topics in English and Azerbaijani: subject-verb agreement (in person and number) and the head of the phrase. First, the grammatical relation of agreement is examined. Agreement means that a singular verb occurs with a singular subject and a plural verb with a plural subject. In English, all verbs except the verb to be show number agreement with their subject only in the third person singular and in the present tense...


the innovation of a statistical model to estimate dependable rainfall (dr) and develop it for determination and classification of drought and wet years of iran

Water from precipitation supplies the countless needs of living beings, especially humans, and any reduction in its quantity or quality directly and negatively affects the life of living organisms. Year-to-year fluctuation is a fundamental and very important feature of Iran's annual precipitation, and its harmful effects are reflected in some way in all economic, social, and even political and security spheres. Since the amount of water from precipitation is one of the main components of planning ...


A statistical test for outlier identification in data envelopment analysis

In the use of peer group data to assess individual, typical or best practice performance, the effective detection of outliers is critical for achieving useful results. In these ‘‘deterministic’’ frontier models, statistical theory is now mostly available. This paper deals with the statistical pared sample method and its capability of detecting outliers in data envelopment analysis. In the prese...


a time-series analysis of the demand for life insurance in iran

Based on our analysis of the data, we found that income level and the number of agencies are directly related to the demand for life insurance, while the interest rate and the dependency burden are inversely related to it.

A Realistic Data Cleansing and Preparation Project

Although data cleansing and preparation are significant tasks in many real-world data projects, they are rarely found in project assignments in IS database courses. This paper describes a pilot study of a relatively open-ended project assignment in a graduate database course. The project required the students to cleanse and prepare five datasets on educational statistics from United Nations Dat...




Journal title

volume 5, issue 4

pages 171-185

publication date 2016-12-05
